HLA Typing from 1000 Genomes Whole Genome and Whole Exome Illumina Data
Abstract
Specific HLA genotypes are known to be linked to resistance or susceptibility to certain diseases, or to sensitivity to certain drugs. In addition, high-accuracy HLA typing is crucial for organ and bone marrow transplantation. The most widespread high-resolution HLA typing method to date is Sanger sequencing-based typing (SBT); next-generation sequencing (NGS) based HLA typing is only starting to be adopted as a higher-throughput, lower-cost alternative. By HLA typing the HapMap subset of the public 1000 Genomes paired Illumina data, we demonstrate that HLA-A, -B and -C typing is possible from exome sequencing samples with higher than 90% accuracy. The older 1000 Genomes whole genome sequencing read sets are less reliable and generally unsuitable for HLA typing. We also propose using percent coverage (the extent of exons covered) as a quality check (QC) measure to increase reliability.
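The coverage-based QC idea in the abstract can be illustrated with a minimal sketch. It assumes per-base read depths are already available for each exon of a candidate allele. The function names, the `min_depth` parameter, and the 99% threshold are hypothetical choices made for illustration and are not taken from the paper.

```python
from typing import Dict, Sequence


def coverage_percent(depths: Sequence[int], min_depth: int = 1) -> float:
    """Return the percentage of positions with read depth >= min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


def passes_qc(
    exon_depths: Dict[str, Sequence[int]],
    threshold: float = 99.0,  # hypothetical cutoff, not from the paper
    min_depth: int = 1,
) -> bool:
    """Accept a typing call only if every exon reaches the coverage threshold."""
    return all(
        coverage_percent(depths, min_depth) >= threshold
        for depths in exon_depths.values()
    )


# Toy per-base depths for two exons (made-up numbers).
example = {"exon2": [12, 9, 15, 7], "exon3": [4, 0, 6, 5]}
exon3_coverage = coverage_percent(example["exon3"])  # 3 of 4 bases covered
call_is_reliable = passes_qc(example)
```

In this toy input, exon 3 has one uncovered base, so the call would be flagged as unreliable under the assumed threshold. A real pipeline would derive the depths from read alignments against the HLA reference alleles.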
Similar resources
Evaluating the Coverage and Potential of Imputing the Exome Microarray with Next-Generation Imputation Using the 1000 Genomes Project
Next-generation genotyping microarrays have been designed with insights from large-scale sequencing of exomes and whole genomes. The exome genotyping arrays promise to query the functional regions of the human genome at a fraction of the sequencing cost, thus allowing large number of samples to be genotyped. However, two pertinent questions exist: firstly, how representative is the content of t...
ATHLATES: accurate typing of human leukocyte antigen through exome sequencing
Human leukocyte antigen (HLA) typing at the allelic level can in theory be achieved using whole exome sequencing (exome-seq) data with no added cost but has been hindered by its computational challenge. We developed ATHLATES, a program that applies assembly, allele identification and allelic pair inference to short read sequences, and applied it to data from Illumina platforms. In 15 data sets ...
Accuracy of programs for the determination of HLA alleles from NGS data
The human leukocyte antigen (HLA) genes code for proteins that play a central role in the function of the immune system by presenting peptide antigens to T cells. As HLA genes show extremely high genetic polymorphism, HLA typing on the allele level is demanding and is based on DNA sequencing. Determination of HLA alleles is warranted as many HLA alleles are major genetic facto...
Deep whole-genome sequencing of 90 Han Chinese genomes
Next-generation sequencing provides a high-resolution insight into human genetic information. However, the focus of previous studies has primarily been on low-coverage data due to the high cost of sequencing. Although the 1000 Genomes Project and the Haplotype Reference Consortium have both provided powerful reference panels for imputation, low-frequency and novel variants remain difficult to d...
Accurate HLA Typing at High-Digit Resolution from NGS Data
Human leukocyte antigen (HLA) typing from next generation sequencing (NGS) data has the potential for applications in clinical laboratories and population genetic studies. Here we introduce a novel technique for HLA typing from NGS data based on read-mapping using a comprehensive reference panel containing all known HLA alleles and de novo assembly of the gene-specific short reads. An accurate ...